Summary


This  paper  addresses  the  problem  of  estimation  of  the  parameters 
of  the  Poisson  sum  of  Gaussian  random  variables  imbedded  in  a  background 
of  Gaussian  noise  when  only  realizations  of  the  sum  are  observable. 
Cumulant matching, maximum likelihood, and empirically orthogonalized
characteristic function procedures are considered. The characteristic
function  and  the  maximum  likelihood  procedures  produce  similar  results 
in  a  simulation  study.  However,  the  characteristic  function  procedure 
is  computationally  superior.  Conditions  under  which  all  procedures  are 
incapable  of  parameter  estimation  are  discussed. 
1.  Introduction


Let the random variable $Z_0$ have a normal distribution with zero
mean and variance $\sigma_1^2$; denote by $Z_1, Z_2, \ldots$ a sequence of
variates which are identically normally distributed with means $\mu$ and
variances $\sigma_2^2$. The $Z_j$, $j = 0, 1, 2, \ldots$, are taken to be
independent of one another and also of the discrete random variable $N$,
which has a Poisson distribution with parameter $\lambda$. The problem of
concern in this paper will be the estimation of the parameters
$\theta = (\mu, \lambda, \sigma_1^2, \sigma_2^2)'$ when only realizations of the sum

$$X = Z_0 + \sum_{k=1}^{N} Z_k \qquad (1.1)$$

are observable.
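As an illustration only, realizations of the sum (1.1) can be simulated with the Python standard library. The sketch below is not taken from the paper: the function names, the seed, the parameter values and the use of Knuth's multiplication method for the Poisson draw are all assumptions made for this example.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate by Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def sample_x(mu, lam, var1, var2, rng):
    """Draw one realization of X = Z_0 + Z_1 + ... + Z_N as in (1.1)."""
    n_events = sample_poisson(lam, rng)
    x = rng.gauss(0.0, math.sqrt(var1))      # background noise Z_0
    for _ in range(n_events):                # the N jumps Z_1, ..., Z_N
        x += rng.gauss(mu, math.sqrt(var2))
    return x


# Hypothetical parameter values: mu = 2, lambda = 1, sigma_1^2 = sigma_2^2 = 1.
rng = random.Random(12345)                   # arbitrary fixed seed
sample = [sample_x(2.0, 1.0, 1.0, 1.0, rng) for _ in range(20000)]
sample_mean = sum(sample) / len(sample)
```

Under these assumed values the sample mean should lie close to $\lambda\mu = 2$ and the sample variance close to $\sigma_1^2 + \lambda(\mu^2 + \sigma_2^2) = 6$.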

The  topic  will  be  motivated  in  this  section  by  a  brief  account  of 
a  certain  security  price  model  proposed  by  Press  (1967; 1968)  which  leads 
to the estimation problem at hand. Models of this type occur frequently
in communications engineering and can also be categorized as being of
the cumulative damage or asset flow type, so that the problem is likely
to be of interest in a wide variety of possible applications.

The  fundamental  assumptions  of  the  price  fluctuation  model 
advocated  by  Press  may  be  summarized  by  supposing  that  the  net  increase 
or  decrease  in  value  of  a  security  over  a  given  time  interval  may  be 
represented  as  a  random  sum  of  independent  price  changes  superimposed  on 
an  independent  process  of  background  noise.  Each  price  change  is  triggered 
by  the  arrival  of  some  "information  event,"  which  occurs  from  time  to  time 
in accordance with a Poisson process $N(t)$ having parameter $\lambda$. The logged
price of the security (which should be adjusted to compensate for stock
splits, dividend payments, and so forth) at some time $t$ can then be
characterized by the equation

$$P(t) = P_0 + \sum_{k=1}^{N(t)} Z_k + Y(t), \qquad t > 0, \qquad (1.2)$$

where $P_0$ is the initial log-price at the base time $t = 0$, the $Z_k$ are
independent random variables representing price changes due to the occurrence
of information events, and $Y(t)$ is the background noise process, $Y(0) = 0$.
Press takes the $Z_k$ to be normal with mean $\mu$ and variance $\sigma_2^2$ and
supposes that $Y(t)$ is a Wiener process with parameter $\sigma_1^2$, so that
$Y(t)$ has stationary and independent increments, and for any $t > 0$ $Y(t)$ has
a normal distribution with zero mean and variance $\sigma_1^2 t$. The processes
$N(t)$ and $Y(t)$ are assumed to be independent of one another and of the $Z_k$.
$P(t)$ represents the log-price of the security rather than the price itself
primarily to account for the empirically justifiable belief that the variation
of price change should be positively related to the magnitude of a security's
value.

Security price data are typically compiled at regular time
intervals, whose length may be taken to be one unit without loss of
generality. Then, letting $X_t$, $t = 1, 2, \ldots$, represent the change in
log-price of the security in the interval $(t-1, t]$, it follows by
differencing equation (1.2) that

$$X_t = \sum_{k=N(t-1)+1}^{N(t)} Z_k + Z_{0t}, \qquad t = 1, 2, \ldots, \qquad (1.3)$$

where $Z_{0t} = Y(t) - Y(t-1)$, so that each $X_t$ has the same distribution as
the variate $X$ of equation (1.1). Furthermore, the log-price changes $X_t$ and
$X_{t'}$ are independent for $t \neq t'$ since the processes $Y(t)$ and $N(t)$
have independent increments and the random variables $Z_1, Z_2, \ldots$ have
been assumed independent. If a realization of the process $P(t)$,
$0 \le t \le n$, is available, then the problem of estimating its parameters
$\theta = (\mu, \lambda, \sigma_1^2, \sigma_2^2)'$ may therefore be reduced to the
problem of estimating the parameters of the distribution associated with the
variate $X$ of (1.1), based on the random sample $X_1, X_2, \ldots, X_n$.
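The reduction from a price record to the sample $X_1, \ldots, X_n$ amounts to differencing log-prices. A minimal sketch follows, assuming the prices are already adjusted for splits and dividends and are equally spaced in time; the function name and the example prices are hypothetical.

```python
import math


def log_price_changes(prices):
    """Return X_t = log P_t - log P_{t-1} for equally spaced, adjusted prices."""
    logs = [math.log(p) for p in prices]
    return [later - earlier for earlier, later in zip(logs, logs[1:])]


# Three hypothetical adjusted prices give two log-price changes.
changes = log_price_changes([100.0, 110.0, 99.0])
```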

Empirical  investigations  of  Fama  (1965)  indicate  that  the  distri¬ 
bution  of  log-price  changes  should  possess  thicker  tails  and  be  more  peaked 
about  some  measure  of  central  tendency  than  would  be  permitted  by  a  Gaussian 
distribution.  Press'  compound  events  model,  represented  by  equations  (1.2) 
and  (1.3),  can  be  shown  to  possess  these  properties  (Press,  1968).  It  is 
also  in  general  skewed,  a  property  which  some  empirical  evidence  suggests 
may  be  appropriate  (Fielitz  and  Smith,  1972;  Leitch  and  Paulson,  1975). 


2.  Estimation  by  Cumulant  Matching 

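By a conditioning argument, it is easy to show that the distribution
function associated with the random variable $X$ of (1.1) is

$$F(x;\theta) = \sum_{q=0}^{\infty} \frac{e^{-\lambda}\lambda^q}{q!}\, \Phi\!\left(\frac{x - q\mu}{\sqrt{\sigma_1^2 + q\sigma_2^2}}\right), \qquad (2.1)$$

where $\Phi(\cdot)$ is the distribution function of a standard normal deviate;
the corresponding density and characteristic function are

$$f(x;\theta) = \sum_{q=0}^{\infty} \frac{e^{-\lambda}\lambda^q}{q!}\, \left(2\pi(\sigma_1^2 + q\sigma_2^2)\right)^{-1/2} \exp\left\{-\frac{(x - q\mu)^2}{2(\sigma_1^2 + q\sigma_2^2)}\right\} \qquad (2.2)$$

and

$$\phi(u;\theta) = \exp\left\{-\tfrac{1}{2}\sigma_1^2 u^2 + \lambda\left[\exp\left(i\mu u - \tfrac{1}{2}\sigma_2^2 u^2\right) - 1\right]\right\}. \qquad (2.3)$$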


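The cumulants of the distribution may be found by developing the
cumulant generating function $\log \phi(u;\theta)$ in powers of $u$. We shall
require the first four of these:

$$\begin{aligned}
\kappa_1 &= \lambda\mu \\
\kappa_2 &= \sigma_1^2 + \lambda(\mu^2 + \sigma_2^2) \\
\kappa_3 &= \lambda\mu(\mu^2 + 3\sigma_2^2) \\
\kappa_4 &= \lambda(\mu^4 + 6\mu^2\sigma_2^2 + 3\sigma_2^4)
\end{aligned} \qquad (2.4)$$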
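The four population cumulants can be evaluated directly from a parameter vector. The helper below is a small sketch; the function name and the example parameter values are assumptions made for illustration.

```python
def population_cumulants(mu, lam, var1, var2):
    """First four cumulants of X in (1.1); var1, var2 are sigma_1^2, sigma_2^2."""
    k1 = lam * mu
    k2 = var1 + lam * (mu ** 2 + var2)
    k3 = lam * mu * (mu ** 2 + 3.0 * var2)
    k4 = lam * (mu ** 4 + 6.0 * mu ** 2 * var2 + 3.0 * var2 ** 2)
    return k1, k2, k3, k4


# Hypothetical values mu = 2, lambda = 1, sigma_1^2 = sigma_2^2 = 1.
cumulants = population_cumulants(2.0, 1.0, 1.0, 1.0)  # (2.0, 6.0, 14.0, 43.0)
```

Note that $\kappa_1$ and $\kappa_3$ both vanish when $\mu = 0$, which is the symmetric case.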

Let $X_1, X_2, \ldots, X_n$ be a random sample drawn from (2.1). Generally
there will be no need to explicitly consider the underlying process $P(t)$ of
equation (1.2) which may have generated the sample.

Since the density $f(x;\theta)$ of (2.2) has no simple closed form, the
method of maximum likelihood may not provide a computationally attractive
solution to the problem of estimation of $(\mu, \lambda, \sigma_1^2, \sigma_2^2)$.
For this reason and because of the simplicity of (2.4), Press (1967; 1968)
has suggested a cumulant matching procedure.
Let $\bar\kappa_j$, $j = 1, 2, 3, 4$, represent the first four cumulants of the
sample $\{X_1, X_2, \ldots, X_n\}$. These are related to the sample mean
$\bar X$ and the central moments $m_\nu = \frac{1}{n}\sum_{j=1}^{n}(X_j - \bar X)^\nu$ by

$$\bar\kappa_1 = \bar X, \quad \bar\kappa_2 = m_2, \quad \bar\kappa_3 = m_3, \quad \bar\kappa_4 = m_4 - 3m_2^2. \qquad (2.5)$$


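Equating the sample cumulants to the respective population cumulants
given in equation (2.4) yields a system of four equations in the four
unknown parameters. After some reduction the system may be written as

$$\begin{aligned}
\mu^4 - \left(\frac{2\bar\kappa_3}{\bar\kappa_1}\right)\mu^2 + \left(\frac{3}{2}\,\frac{\bar\kappa_4}{\bar\kappa_1}\right)\mu - \frac{\bar\kappa_3^2}{2\bar\kappa_1^2} &= 0, \\
\lambda &= \bar\kappa_1/\mu, \\
\sigma_2^2 &= \frac{\bar\kappa_3}{3\bar\kappa_1} - \frac{\mu^2}{3}, \\
\sigma_1^2 &= \bar\kappa_2 - \left(\frac{\bar\kappa_1}{\mu}\right)(\mu^2 + \sigma_2^2).
\end{aligned} \qquad (2.6)$$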

Cumulant matching estimates $\bar\theta = (\bar\mu, \bar\lambda, \bar\sigma_1^2, \bar\sigma_2^2)'$
may then be defined by requiring that they satisfy the system of equation
(2.6). The quartic equation has, of course, four roots, real or complex; in
every case to which this procedure has been applied, it has been found that
exactly two of these are real, and are of opposite sign. The root to which
$\bar\mu$ should be equated is then that real root which causes the intensity
parameter estimate $\bar\lambda = \bar\kappa_1/\bar\mu$ to be positive; that is,
$\bar\kappa_1$ and $\bar\mu$ should be of similar sign.
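The cumulant matching recipe can be sketched in a few lines. The code below is an illustration rather than the implementation used in the study: it assumes $\kappa_1 \neq 0$ and $\kappa_3 \neq 0$, brackets a real root of the quartic on the side of the origin having the sign of $\kappa_1$, and locates it by plain bisection (a root finder chosen here for simplicity). The function name is hypothetical.

```python
def cumulant_matching(k1, k2, k3, k4):
    """Solve the cumulant matching system for (mu, lam, var1, var2).

    Assumes k1 != 0 and k3 != 0. The root of the quartic is sought with the
    same sign as k1, so that lam = k1 / mu is positive.
    """
    a = k3 / k1
    b = 3.0 * k4 / (2.0 * k1)
    c = a * a / 2.0
    sign = 1.0 if k1 > 0 else -1.0

    def quartic(t):
        # Quartic in mu evaluated at mu = sign * t, for t >= 0.
        m = sign * t
        return m ** 4 - 2.0 * a * m ** 2 + b * m - c

    # quartic(0) = -c < 0 and quartic(t) -> +infinity, so a root is bracketed.
    lo, hi = 0.0, 1.0
    while quartic(hi) < 0.0:
        hi *= 2.0
    for _ in range(200):                     # plain bisection
        mid = 0.5 * (lo + hi)
        if quartic(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    mu = sign * 0.5 * (lo + hi)
    lam = k1 / mu
    var2 = (a - mu * mu) / 3.0
    var1 = k2 - lam * (mu * mu + var2)
    return mu, lam, var1, var2


# Exact cumulants for mu = 2, lambda = 1, sigma_1^2 = sigma_2^2 = 1.
estimates = cumulant_matching(2.0, 6.0, 14.0, 43.0)
```

Fed exact population cumulants, the routine returns the generating parameters up to rounding; with sample cumulants nothing prevents the returned variance estimates from being negative.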

The sample cumulants $\bar\kappa_j$ are, apart from $\bar\kappa_1$, not unbiased
estimators of the corresponding population cumulants, although they are of
course consistent. They have therefore been replaced in (2.6) by the
first four of Fisher's k-statistics, which are unbiased (Kendall and
Stuart, I, 1969, p. 281).
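For reference, the first four k-statistics can be computed from the sample central moments using the standard textbook expressions. The sketch below assumes a sample of at least four observations; the function name is hypothetical.

```python
def k_statistics(sample):
    """First four of Fisher's k-statistics (requires at least 4 observations)."""
    n = len(sample)
    mean = sum(sample) / n
    # Central moments m_2, m_3, m_4 with divisor n.
    m2, m3, m4 = (sum((x - mean) ** r for x in sample) / n for r in (2, 3, 4))
    k1 = mean
    k2 = n * m2 / (n - 1)
    k3 = n ** 2 * m3 / ((n - 1) * (n - 2))
    k4 = (n ** 2 * ((n + 1) * m4 - 3.0 * (n - 1) * m2 ** 2)
          / ((n - 1) * (n - 2) * (n - 3)))
    return k1, k2, k3, k4


# Small worked example: k2 equals the usual unbiased sample variance.
k_example = k_statistics([1.0, 2.0, 3.0, 4.0, 5.0])
```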

The  most  attractive  feature  of  cumulant  matching  is  its  ease  of 
application;  one  has  only  to  compute  the  first  four  cumulants  (or  k-statis¬ 
tics)  of  the  sample  and  then  solve  a  quartic  equation  to  obtain  estimates 
which  have  the  desirable  properties  of  consistency  and  asymptotic  normality. 
Unfortunately,  such  estimators  appear  to  possess  rather  low  efficiencies. 

In an analysis of the price fluctuations of ten securities used in computing
the Dow Jones Industrial Average, Press (1967) found that the use of the
cumulant matching method led to infeasible estimates (with either
$\bar\sigma_1^2$ or $\bar\sigma_2^2$ negative) in every case. After the infeasible
parameter estimate was set equal to zero, he furthermore found that the
distribution function computed by substitution of the estimated parameter
values into equation (2.1) gave a visibly poor fit when graphically compared
to the empirical distribution function associated with the sample
$X_1, X_2, \ldots, X_n$. These
estimates  were  based  on  sample  sizes  ranging  from  1S5  to  *499,  and  Press 
concluded  that  much  larger  sample  sizes  are  required  to  achieve  reasonable 
estimates  by  cumulant  matching.  This  is  consistent  with  the  simulation 
results  to  be  subsequently  presented  which  indicate  that  for  samples  of 
size  500,  cumulant  matching  is  totally  inadequate,  at  least  in  those 
portions  of  the  parameter  space  that  were  considered. 
3.  Characteristic Function Estimators

Because the characteristic function $\phi(u;\theta)$ associated with the
modified compound Poisson distribution has a reasonably simple form,
it was believed that a method of parameter estimation based on
characteristic functions might provide estimators for the parameter vector
$\theta = (\mu, \lambda, \sigma_1^2, \sigma_2^2)'$ with reasonable efficiency and
computational tractability. Estimates for the true parameter vector $\theta_0$
can be obtained by numerically determining the zeros of

$$S_{n\theta} = 2\,\mathrm{Re} \sum_{j=1}^{n} \int_{-\infty}^{\infty} \frac{\partial\phi(u;\theta)}{\partial\theta}\, \left[\phi(u;\theta) - \exp(iux_j)\right]^{*} \psi(\phi(u;\theta))\, du = 0 \qquad (3.1a)$$

for $\theta = \mu, \lambda, \sigma_1^2, \sigma_2^2$ and some function $\psi(\cdot)$.
Clearly $E(S_{n\theta}) = 0$ under mild regularity conditions and we can expect
that the M-estimators derived from this system or its discrete counterpart,

$$S_{n\theta} = 2\,\mathrm{Re} \sum_{\ell=1}^{p} \sum_{j=1}^{n} \frac{\partial\phi(u_\ell;\theta)}{\partial\theta}\, \left[\phi(u_\ell;\theta) - \exp(iu_\ell x_j)\right]^{*} \psi(\phi(u_\ell;\theta)) = 0 \qquad (3.1b)$$

for some $p$ and $\theta = \lambda, \mu, \sigma_1^2, \sigma_2^2$, will be consistent and
asymptotically normal (Thornton and Paulson, 1977). An appealing feature of
(3.1) is that the weight function adapts itself to the data under the purview
of the assumed model, in this case the modified compound Poisson distribution.
Apart from the weighting, equations (3.1) are very similar in form to the
normal equations of nonlinear least squares

$$\sum \frac{\partial(\text{expected})}{\partial\theta}\,(\text{expected} - \text{observed}) = 0,$$

for parameter $\theta$. The observed terms are replaced by $\exp(iux_j)$ and the
expected terms are replaced by $\phi(u;\theta)$. It thus makes sense to choose
the weights $\psi(\phi(u_\ell;\theta))$ in inverse proportion to the standard
deviations of the $\{\phi(u_\ell;\theta) - \exp(iu_\ell x_j)\}$. However, the
residuals $\{\phi(u_\ell;\theta) - \exp(iu_\ell x_j)\}$ are not uncorrelated. It
will therefore often be advantageous to make use of this correlation and we
shall do so presently.

It will be more convenient to work with the quantity

$$y(u;\theta) = \mathrm{Re}\,\phi(u;\theta) + \mathrm{Im}\,\phi(u;\theta) \qquad (3.2)$$

and its sample estimate

$$y_n(u) = \mathrm{Re}\,\phi_n(u) + \mathrm{Im}\,\phi_n(u), \qquad (3.3)$$

where

$$\phi_n(u) = n^{-1}\sum_{j=1}^{n} \exp(iux_j) = n^{-1}\sum_{j=1}^{n} (\cos ux_j + i \sin ux_j), \qquad (3.4)$$

instead of $\phi(u;\theta)$. Clearly $E(\phi_n(u)) = \phi(u;\theta)$ and hence
$E(y_n(u)) = y(u;\theta)$ for all $u$. It is easy to show that the covariance
kernel of the real process $y_n(u)$ is given by

$$\begin{aligned}
K(u,v) &= n\,\mathrm{cov}(y_n(u), y_n(v)) \\
&= \mathrm{Re}\,\phi(u-v;\theta) + \mathrm{Im}\,\phi(u+v;\theta) - [\mathrm{Re}\,\phi(u;\theta) + \mathrm{Im}\,\phi(u;\theta)][\mathrm{Re}\,\phi(v;\theta) + \mathrm{Im}\,\phi(v;\theta)].
\end{aligned} \qquad (3.5)$$

(See also Bryant, unpublished Ph.D. dissertation, Rensselaer Polytechnic
Institute, 1977.) The residuals $y_n(u) - y(u;\theta)$ have covariance kernel
$n^{-1}K(u,v)$. Define

$$\mathbf{y}_n = (y_n(u_1), y_n(u_2), \ldots, y_n(u_p))' \qquad (3.6)$$

$$\mathbf{y}(\theta) = (y(u_1,\theta), y(u_2,\theta), \ldots, y(u_p,\theta))' \qquad (3.7)$$

where $u_1, u_2, \ldots, u_p$ have been chosen so that the matrix $K$ is positive
definite. It is always possible to choose such $u_1, u_2, \ldots, u_p$ since the
covariance function $K(u,v)$ is positive definite (Feller 1963, Ch. XIX).
The $p \times 1$ random vector $z(\theta) = K^{-1/2}(\mathbf{y}_n - \mathbf{y}(\theta))$
has covariance matrix $I$.
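The quantities in (3.2) through (3.5) are straightforward to evaluate numerically. The sketch below assumes the closed form of the characteristic function implied by the Poisson-Gaussian construction of (1.1), namely $\exp\{-\sigma_1^2u^2/2 + \lambda[\exp(i\mu u - \sigma_2^2u^2/2) - 1]\}$; all function names are hypothetical.

```python
import cmath
import math


def char_fn(u, mu, lam, var1, var2):
    """Characteristic function of X in (1.1) under the stated model."""
    jump = cmath.exp(1j * mu * u - 0.5 * var2 * u * u)
    return cmath.exp(-0.5 * var1 * u * u + lam * (jump - 1.0))


def y_model(u, theta):
    """y(u; theta) = Re phi + Im phi, as in (3.2)."""
    phi = char_fn(u, *theta)
    return phi.real + phi.imag


def y_sample(u, xs):
    """Sample transform y_n(u) of (3.3), written with cosines and sines."""
    return sum(math.cos(u * x) + math.sin(u * x) for x in xs) / len(xs)


def cov_kernel(u, v, theta):
    """Covariance kernel K(u, v) of (3.5)."""
    return (char_fn(u - v, *theta).real + char_fn(u + v, *theta).imag
            - y_model(u, theta) * y_model(v, theta))


theta = (2.0, 1.0, 1.0, 1.0)          # hypothetical (mu, lambda, var1, var2)
k_half = cov_kernel(0.5, 0.5, theta)  # variance of cos(uX) + sin(uX) at u = 0.5
```

Because $y_n(0) = 1$ for every sample, the kernel vanishes whenever either argument is zero, which is one reason the point $u = 0$ carries no information.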

The estimation of $\theta$ may be effected through consideration of the
objective function

$$Q(\theta) = (\mathbf{y}_n - \mathbf{y}(\theta))' K^{-1} (\mathbf{y}_n - \mathbf{y}(\theta)) = \sum_{q=1}^{p}\sum_{q'=1}^{p} (y_n(u_q) - y(u_q,\theta))(y_n(u_{q'}) - y(u_{q'},\theta))\, k^{qq'} \qquad (3.8)$$

where the $k^{qq'}$, $q, q' = 1, 2, \ldots, p$, are elements of the inverse matrix
$K^{-1}$. Since $(\mathbf{y}_n - \mathbf{y}(\theta))$ is asymptotically
$p$-dimensional Gaussian, $Q(\theta)$ is asymptotically $\chi^2$ on $p$ degrees of
freedom. Accordingly, estimation of the parameters
$\mu, \lambda, \sigma_1^2, \sigma_2^2$ by way of $Q(\theta)$ is approximately a
$\chi^2$ minimum procedure and can be expected to be quite efficient. In fact,
this has been independently and recently shown by Feuerverger and McDunnough
(1981). The function $Q(\theta)$ depends on the unknown value $\theta_0$ of the
parameter vector through the matrix $K$. Estimation could still be effected by
regarding $K$ as a function of the minimizing variable $\theta$ but such a
procedure would require an inversion of the matrix at each iteration of the
minimizing algorithm and so would lead to computational expense. An
alternative to direct minimization of the $\chi^2$-like statistic is to proceed
in stages via a modified $\chi^2$ minimum procedure where the matrix $K$ is held
constant during the differentiation stage and allowed to be variable
thereafter. Instead, we use the fact that the vector $\mathbf{y}_n$ is the mean
of the independent and identically distributed random vectors $s_j$,
$j = 1, 2, \ldots, n$, whose elements are

$$s_{jq} = \cos(u_q X_j) + \sin(u_q X_j), \qquad q = 1, 2, \ldots, p,$$

where $X_1, X_2, \ldots, X_n$ is the sample drawn from the population whose
parameters are to be estimated. The matrix $K$ in (3.8) may be replaced by
the sample covariance matrix $K_n$ with the general element

$$k_{qq'n} = \frac{1}{n-1}\sum_{j=1}^{n} \{s_{jq} - y_n(u_q)\}\{s_{jq'} - y_n(u_{q'})\}.$$

Thus the characteristic function estimates $\hat\theta_n$ may be generated by
minimizing over $\theta$ the sum

$$Q_n(\theta) = \sum_{q}\sum_{q'} (y_n(u_q) - y(u_q,\theta))(y_n(u_{q'}) - y(u_{q'},\theta))\, k_n^{qq'} \qquad (3.9)$$

where the $k_n^{qq'}$ are elements of $K_n^{-1}$. After such estimates are
obtained, they may be refined by using equation (3.5), evaluated at the
estimated parameter values, to re-approximate the covariance matrix $K$; then
a second minimization step may be performed.
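The pieces of (3.9) that depend only on the data can be sketched as follows. This is an illustration under stated assumptions, not the original program: the residual vector $y_n(u_q) - y(u_q,\theta)$ is taken as an input supplied by the caller, the linear system is solved by Gaussian elimination instead of forming $K_n^{-1}$ explicitly, and all function names are hypothetical. The resulting value would then be handed to a minimizer over $\theta$.

```python
import math


def s_vector(x, us):
    """Vector s_j with elements cos(u_q x) + sin(u_q x)."""
    return [math.cos(u * x) + math.sin(u * x) for u in us]


def sample_mean_and_cov(xs, us):
    """Return the vector y_n and the sample covariance matrix K_n."""
    n, p = len(xs), len(us)
    rows = [s_vector(x, us) for x in xs]
    mean = [sum(r[q] for r in rows) / n for q in range(p)]
    cov = [[sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in rows) / (n - 1)
            for b in range(p)] for a in range(p)]
    return mean, cov


def solve(matrix, rhs):
    """Solve matrix @ z = rhs by Gaussian elimination with partial pivoting."""
    p = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, p):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, p + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(aug[i][c] * z[c] for c in range(i + 1, p))
        z[i] = (aug[i][p] - tail) / aug[i][i]
    return z


def q_n(residual, cov):
    """Quadratic form residual' cov^{-1} residual, as in (3.9)."""
    z = solve(cov, residual)
    return sum(r * zi for r, zi in zip(residual, z))


q_value = q_n([1.0, 2.0], [[2.0, 0.0], [0.0, 4.0]])  # 1/2 + 4/4 = 1.5
```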

The algorithm outlined in the preceding paragraph has been applied
to simulated data having the modified compound Poisson distribution through
the use of a simplex minimization procedure (Jacoby, Kowalik and Pizzo, 1972,
p. 79) applied to equation (3.9), where the variables $\lambda$, $\sigma_1^2$ and
$\sigma_2^2$ were replaced by their logarithms to result in an unconstrained
problem. Discussion of the performance of the estimation procedure will be
deferred until after likelihood is discussed. A total of $p = 40$ points $u_q$
were used, twenty placed symmetrically on either side of the origin; the
effect of their placement was not extensively studied, but did not appear to
be too critical, so long as several points were always included near the
origin. Since $\phi(u/\sqrt{\kappa_2};\theta)$, where $\kappa_2$ is the variance of
the distribution given by (2.2), corresponds to random variables with unit
scale, it seems reasonable to use the quantity $1/\sqrt{\bar\kappa_2}$, where
$\bar\kappa_2$ is the sample variance, as a unit of measurement in determining
the placement of the $u_q$. As a rule of thumb, placing the two points closest
to the origin at $\pm$ — - — and then gradually increasing the interval between
consecutive points as $|u| \to \infty$ appears to work reasonably well.
Figure 1 provides an indication of the agreement of the sample transform
$y_n(u)$ with the theoretical transform $y(u)$ for a sample of size 500. We
thus expect to do reasonably well in estimating the parameters from the
data with $p = 40$.

In retrospect, it is believed that $p$ could probably have been
chosen to be somewhat less than 40 without seriously degrading the
resultant estimators. If, however, the value $p = 40$ is used in equation
(3.9), it will be found that the sample covariance matrix $K_n$ will be
quite ill-conditioned. Its inversion may therefore sometimes prove to be
numerically troublesome. A moderate (5 to 10 percent) inflation of its
diagonal elements will alleviate this difficulty, and seems to have no
harmful effect on the estimates.
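The diagonal inflation is a one-line operation. The sketch below is illustrative only; the function names and the 2 by 2 example matrix are assumptions, chosen so that the effect on a nearly singular matrix is easy to see.

```python
def inflate_diagonal(matrix, fraction=0.05):
    """Return a copy of `matrix` with each diagonal entry scaled by 1 + fraction."""
    return [[value * (1.0 + fraction) if i == j else value
             for j, value in enumerate(row)]
            for i, row in enumerate(matrix)]


def determinant_2x2(m):
    """Determinant of a 2 by 2 matrix, used here only to show the effect."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


nearly_singular = [[1.0, 0.999], [0.999, 1.0]]
inflated = inflate_diagonal(nearly_singular, 0.05)
```

In this example the determinant rises from about 0.002 to about 0.104, so the inflated matrix is far better conditioned for inversion.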

There are theoretical difficulties associated with the fact that
the function $Q_n(\theta)$ of equation (3.9) measures the deviation of the
empirical function $y_n(u)$ from the theoretical function $y(u,\theta)$ at only
a finite number of points. Although $y(u,\theta)$ corresponds uniquely to the
characteristic function $\phi(u,\theta)$ and so uniquely determines the
distribution function $F(x,\theta)$ of equation (2.1), it may happen that two
distinct feasible parameter vectors $\theta_1$ and $\theta_2$ satisfy
$y(u_q,\theta_1) = y(u_q,\theta_2)$ at each of the points $u_1, u_2, \ldots, u_p$
even though this cannot be the case identically in $u$. This is clearly
undesirable since it implies that the estimation procedure is incapable of
differentiating between samples drawn from the distinct distributions
$F(x;\theta_1)$ and $F(x;\theta_2)$. Fortunately, from a practical point of view
this phenomenon causes no difficulty as long as $u_1, u_2, \ldots, u_p$ are not
too widely spaced, because then the vectors $\theta_1$ and $\theta_2$ must be
greatly separated in the parameter space.

In order to make this last statement more precise, consider as
a simple example the use of just six points located on the $u$-axis at
$\pm\epsilon$, $\pm 2\epsilon$ and $\pm 3\epsilon$, where $\epsilon > 0$. Then by
straightforward algebra it can be shown that the six values $y(u,\theta)$,
$u = \pm\epsilon, \pm 2\epsilon, \pm 3\epsilon$, uniquely determine the four
elements of the parameter vector $\theta = (\mu, \lambda, \sigma_1^2, \sigma_2^2)'$
if $\theta$ is further assumed to lie in the reduced parameter space
$\Theta_R = \{\theta : |\mu| <$ — - —$;\ \lambda, \sigma_1^2, \sigma_2^2 > 0\}$. To
continue with this example, suppose the grid size $\epsilon$ is chosen equal to
— - —, which is the position of the smallest positive point according to the
previously mentioned rule of thumb, and suppose further that the sample
variance estimates $\kappa_2$ essentially without error. Then, letting
$\theta_0 = (\mu_0, \lambda_0, \sigma_{10}^2, \sigma_{20}^2)$ denote the true values
of the parameters,


and 


( 3. 10) 


(3.11) 


Thus $\theta_0$ will lie in $\Theta_R$ as long as $\lambda_0$ is in the interval

$$\lambda_0 \in \left(\frac{1}{16\pi^2}, \frac{16\pi^2}{9}\right) = (0.006, 17.54) \qquad (3.12)$$


since  then,  from  equations  (3.10)  and  (3.11) 


For $\lambda_0$ in the interval (3.12), the fact that the values of $y(u,\theta)$
at the six points $u = \pm\epsilon, \pm 2\epsilon, \pm 3\epsilon$ completely specify
$\theta$ within $\Theta_R$ allows one to construct a proof of the strong
consistency of the estimates obtained by minimizing $Q_n(\theta)$ of equation
(3.9) over any compact subset of $\Theta_R$ containing the true parameter
vector. The proof is totally analogous to that given by Bryant and Paulson
(1979). If, on the other hand, $\lambda_0$ does not lie in the interval (3.12),
effective simultaneous estimation of all four parameters is a practical
impossibility no matter what method may be used. The reasons for this
phenomenon, which have to do with the insensitivity of the distribution
function $F(x;\theta)$ to its parameters for extreme values of $\lambda$, will be
subsequently discussed.

The simple example of the preceding paragraphs is not meant to
imply that only six $u$-values should be used in the computation of the
objective function $Q_n(\theta)$, or even that $u_1, u_2, \ldots, u_p$ should
necessarily be equally spaced. Rather, it is intended to at least partially
justify the empirical observation that, even though the measurement of the
deviation between the functions $y_n(u)$ and $y(u;\theta)$ at only a finite
number of points poses theoretical difficulties, these should not disqualify
the proposed estimation procedure from practical consideration.
4.  Maximum Likelihood Estimators


The probability density function of the modified compound Poisson
distribution is given in equation (2.2) and may be expressed as

$$f(x) = \sum_{q=0}^{\infty} p_q f_q(x),$$

where

$$p_q = e^{-\lambda}\lambda^q / q!$$

and

$$f_q(x) = \left(2\pi(\sigma_1^2 + q\sigma_2^2)\right)^{-1/2} \exp\left\{-\frac{(x - q\mu)^2}{2(\sigma_1^2 + q\sigma_2^2)}\right\}.$$
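The mixture form lends itself to direct numerical evaluation once the infinite series is truncated. The sketch below is an illustration only: the truncation point `q_max` and the function names are assumptions, and the log-likelihood is taken to be the sum of log densities over the sample.

```python
import math


def density(x, mu, lam, var1, var2, q_max=60):
    """Truncated series for f(x) = sum over q of p_q * f_q(x)."""
    total = 0.0
    p_q = math.exp(-lam)                     # Poisson weight p_0
    for q in range(q_max + 1):
        var = var1 + q * var2
        f_q = (math.exp(-(x - q * mu) ** 2 / (2.0 * var))
               / math.sqrt(2.0 * math.pi * var))
        total += p_q * f_q
        p_q *= lam / (q + 1)                 # p_{q+1} = p_q * lam / (q + 1)
    return total


def log_likelihood(sample, mu, lam, var1, var2):
    """Sum of log f(X_j) over the sample."""
    return sum(math.log(density(x, mu, lam, var1, var2)) for x in sample)


# Crude check with hypothetical parameters: the density should integrate to 1.
step = 0.05
grid = [-15.0 + step * i for i in range(901)]
total_mass = sum(density(x, 1.0, 1.0, 1.0, 1.0) for x in grid) * step
```

The truncation point should grow with $\lambda$, since the Poisson weights $p_q$ decay only once $q$ exceeds $\lambda$.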


In these equations the dependence of $f(x)$, $p_q$ and $f_q(x)$ on the
parameters has been suppressed for later convenience. A system of maximum
likelihood equations can be formed by differentiating the log-likelihood
function

$$L(\mu, \lambda, \sigma_1^2, \sigma_2^2) = \sum_{j=1}^{n} \log f(X_j)$$

with respect to the parameters $\mu$, $\lambda$, $\sigma_1^2$, $\sigma_2^2$, and
setting the resulting expressions equal to zero. This gives

$$\frac{\partial L}{\partial\mu} = \sum_{j} \frac{1}{f(X_j)} \sum_{q} \frac{q\, p_q f_q(X_j)}{\sigma_1^2 + q\sigma_2^2}\,(X_j - q\mu) = 0 \qquad (4.1a)$$

$$\frac{\partial L}{\partial\lambda} = \frac{1}{\lambda} \sum_{j} \frac{1}{f(X_j)} \sum_{q} q\, p_q f_q(X_j) - n = 0 \qquad (4.1b)$$
(m+1)


y{[  -  .  2u  y_ 

i  fTxT  .4,  .2  w  4f 

]  --(y+q) 


p  f  (X. ) 

- r  n  n  ? 


X.  qp 

‘  t — la 

on  4,  ^  ,2 

3  3  <uy+q) 


f  (X.)  +  u‘ 

q  3 


I*!?- r  I- 


. f(X. )  2 

]  ]  q(yfc) 


(  X  .  ) } 


,  ,  ?„f  (x.) 

.j  ^  y  — ; —  y  _i_S _ ] _ 

2  ^  f(X.)  £  ( Y  +q ) 


(4.2d)


where the right hand side of (4.2) is evaluated at the $m$th values of the
parameters.

There are an almost unlimited number of ways to transform the
equations (4.1) into a form suitable for a fixed point procedure; the
form of these equations is partially motivated by heuristic considerations.
It has not been possible to prove that this fixed-point scheme must
converge or yield unique solutions but the algorithm has consistently yielded
reasonable parameter estimates when applied to simulated data.

The convergence of the iterates $\mu^{(m)}$, $\lambda^{(m)}$, $\sigma_2^{2(m)}$ and
$\gamma^{(m)}$ of system (4.2) is unfortunately extremely slow, so that some
sort of acceleration modification is a practical necessity. An adaptation of
the Aitken $\delta^2$ process (Hildebrand, 1974, pp. 567-71) has proven
effective. A detailed description is given in Bryant (unpublished Ph.D.
Thesis, Rensselaer Polytechnic Institute, 1977).
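To indicate what such an acceleration looks like, the sketch below applies the textbook Aitken $\delta^2$ step to a scalar fixed-point iteration $x = g(x)$. It is a generic illustration and not the adaptation described in Bryant's thesis, whose details are not reproduced here; the function name, tolerances and the test map $g = \cos$ are assumptions.

```python
import math


def aitken_fixed_point(g, x0, tol=1e-12, max_iter=100):
    """Iterate x = g(x), applying Aitken's delta-squared step each cycle."""
    x = x0
    for _ in range(max_iter):
        x1 = g(x)
        x2 = g(x1)
        denom = x2 - 2.0 * x1 + x            # second difference of the iterates
        if abs(denom) < 1e-300:              # iterates no longer change
            return x2
        accelerated = x - (x1 - x) ** 2 / denom
        if abs(accelerated - x) < tol:
            return accelerated
        x = accelerated
    return x


# The plain iteration x = cos(x) converges slowly; the accelerated one does not.
root = aitken_fixed_point(math.cos, 1.0)
```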

The major disadvantage of using the fixed-point algorithm to
obtain maximum likelihood estimates for the parameters of the modified
compound Poisson process is the inordinate amount of computer time
required. For samples of size 500 the maximum likelihood procedure took
from two to six times as long as the characteristic function procedure,
depending on the values of the parameters chosen (the amount of time
required increased rapidly with $\lambda$). Furthermore, the time required by
the maximum likelihood method increased with increasing sample size, so
that for very large data sets its use may not be considered economically
feasible. This is not the case with the characteristic function method,
as the time it requires is primarily a function of $p$. As previously
stated, the value of $p$ used in the estimations reported here was 40,
which was quite probably excessive; thus it might be possible to reduce
the amount of computer time required by this procedure without loss in
the accuracy of estimation.

5.  Empirical  Comparisons 

Estimates of the parameters of simulated modified compound Poisson
samples of size 500 are tabulated in Table 1, and may be used to at
least partially evaluate the relative desirability of the cumulant matching,
characteristic function and maximum likelihood estimation procedures. In
view of the considerable amount of computer time required by the
characteristic function and maximum likelihood algorithms, the number of
different combinations of parameter values investigated was necessarily
rather small; in all, the results of 25 simulated samples are contained in
the table.

These data clearly indicate that cumulant matching estimates are
noticeably less efficient than those provided by the other two procedures.
In fact, in the majority of cases it was found that the cumulant matching
method gave infeasible solutions in which one or both of the estimates of
the variance parameters $\sigma_1^2$ and $\sigma_2^2$ were negative. This is
consistent with the results obtained by Press in his attempt to fit security
price data with the modified compound Poisson distribution through use of
this method of parameter estimation.

The characteristic function estimates and those obtained by
maximum likelihood are highly correlated, and for most of the 25
samples in Table 1 yield solutions of nearly equal quality. Their
comparison is made more difficult by the apparent fact that for some of
the combinations of parameters considered (notably whenever $\mu = 0$ and
$\lambda > 1$), sample sizes considerably larger than 500 are required for
truly effective estimation by any means. Accordingly, it would have been
desirable to simulate larger samples with which to compare these procedures.
This, however, was determined to be inadvisable due to the excessive amount
of computer time it would have required. Instead, the parameters of some
of the distributions were re-estimated using the same data as that upon
which Table 1 is based, where in addition it was assumed that the ratio
of the variance parameters $\sigma_1^2$ and $\sigma_2^2$ was known. This
additional information increases the precision of estimation of both
procedures, and also substantially reduces the length of time required by
either algorithm.

The resulting estimates are recorded in Table 2, along with those
calculated from several samples of size 500 not included in the first data
set. Conclusions similar to those drawn on the basis of Table 1 are
supported by these data. Again, characteristic function and maximum
likelihood estimators appear to do about equally well (an exception occurs in

-13- 
TABLE  1 

Comparison  of  Cumulant  Matching  (CM),  Characteristic  Function  (CF) 
and  Maximum  Likelihood  (MLE)  Estimates  of  the  Parameters 
of  the  Modified  Compound  Poisson  Distribution 


u 

X 

ai 

C2 

y 

\ 

°i 

a2 

Parameters 

4 

H 

1 

1 

2 

1 

1 

CM 

5.145 

0.403 

0.248 

-1.050 

3.194 

0 . 343 

0.737 

-0.408 

CF 

3.912 

0.5  36 

0.992 

1.517 

2.592 

3.431 

0 . 970 

0.719 

MLE 

3.863 

0.534 

0.987 

1.509 

2.599 

0.429 

0.354 

0.560 

Parameters 

1 

h 

1 

1 

0 

75 

1 

1 

CM 

2.233 

0.215 

1.096 

-0.159 

-0.354 

0.170 

1.331 

1.424 

CF 

1.395 

0.338 

1.210 

0.841 

-0.139 

0.329 

1.079 

1.539 

.  MLE 

1.429 

0.332 

1.193 

0.738 

-0.123 

0.535 

1.061 

1.020 

Parameters 

4 

1 

X 

1 

2 

4 

X 

1 

1 

CM 

3.488 

1.121 

1.574 

3.154 

3.233 

0.682 

0.537 

-1.452 

CF 

3.356 

0.39  3 

1.079 

1.435 

1.352 

1.620 

0.517 

2.157 

MLE 

4.011 

0.975 

1.192 

1.313 

1.396 

1.578 

0.569 

2.097 

Parameters 

X 

l 

1 

4 

X 

4 

'h 

1 

::i 

1.25? 

0.583 

1.309 

3.  "20 

u .  1 35 

1.943 

-0.244 

4  —  4 

CF 

1.0  33 

3.325 

C  .  9  ^  7 

3.905 

4.122 

1.346 

1 . 302 

0 . 4 to 

MLE 

1.179 

3. "27 

3.333 

0.904 

4  a  1 7 

1 .  944 

1.254 

0.513 

Parameters 

2 

2 

1 

1 

1 

-> 

<- 

1 

1 

CM 

3.773 

5 .213 

-5.515 

-2.505 

2.292 

0.901 

0.55" 

-0. '53 

CF 

2.375 

1.695 

0.520 

0.135 

0 . 

2.113 

1.053 

^  .  _,^4 

"LI 

2.371 

1.7C1 

0.579 

0.153 

4  4  —  - 

1 .  "55 

1.15  5 

0.  :2" 

Parameters 

u 

3 

1 

1 

3 

. 

4 

IM 

5.063 

1.231 

-13.300 

-5.022 

1 '  - 

_  .  3a 

a  3 

-•  -u 

rr 

3.941 

3.359 

C  .  9  6  ^ 

1 .  cc ^ 

2.-22 

2  .  -  -  3 

1.7:9 

n  m  -j 

MLE 

3.316 

3.150 

3.730 

1.153 

: .  -"o 

4  hj  < 

2  .  2  39 

Parameters 

1 

3 

4 

1 

V 

4 

CM 

1.15 

2.54 

0.  "25 

1.22 

4  .  5  9  2 

2.4-12 

-0.006 

- 1 .  3  a  1 

CF 

1 . 145 

2 . 543 

1 .234 

0.341 

4.09  3 

1  3 

0.2  33 

7. ""3 

MLE 

1.207 

2.418 

1.339 

0.896 

u .  2*  a 

7  .  -"2 

0.242 

0  .  ?  1  4 

Parameters 

2 

’-j 

V 

1 

2 

V 

, 

CM 

2.525 

3.332 

m  m.  - 

-0.415 

1 .  :~9 

?  .  -2  1 

*>  *  -  -4 

CF 

2 .404 

Q  .  2"  1 

-  m  —  * 

0.35  3 

«  .  —  ~ 

“  .  4  Ll  ^ 

0 . 243 

1 .  3  c  4 

MLE 

2.403 

0.370 

0 . 2"*^ 

0.347 

1  *  " 

2.-133 

0.253 

1 .200 

Parameters 

- J 

•i 

u 

1 

a 

• 

V 

1 

CM 

0.051 

j  .  }  5  5 

1.-25 

-0.533 

2.220 

: .  ""0 

-1.550 

5 . 4.0  0 

CF 

0.103 

0.438 

2.23" 

1.0  35 

a  „  2  3  3 

3.3  57 

0.242 

1.203 

MLE 

0.092 

0.-al 

3.235 

1.079 

4.353 

2.357 

0.242 

1.136 

0- 


Table 1 continued


              $\mu$   $\lambda$   $\sigma_1^2$   $\sigma_2^2$  |   $\mu$   $\lambda$   $\sigma_1^2$   $\sigma_2^2$

Parameters    2   1   ?   4   ?   ?   1   ?   1
CM            2.977   0.601  -0.194  -1.211  |   1.509   0.7??   0.363   0.957
CF            1.363   0.952   0.247   1.004  |   1.191   0.530   0.253   1.234
MLE           1.362   0.351   0.241   0.952  |   1.174   3.941   0.286   1.232

Parameters    0       ?       ?       ?      |   4       2       ?       ?
CM           -0.001  73.?05  12.3?2  -0.145  |   4.4?9   1.?35  -0.455   0.517
CF           -0.155   0.439   0.4?0   1.599  |   3.335   1.539   0.195   0.371
MLE          -0.134   0.35?   0.530   2.011  |   3.999   1.937   0.131   0.917

Parameters    ?   ?   ?   ?   ?   2   ?   4   1
CM            2.45?   1.701  -0.353   0.320  |   2.012   ?       0.360  -0.099
CF            1.573   2.497   0.037   1.154  |   0.959   2.109   0.342   0.956
MLE           1.334   2.2?9   ?   ?   0.933   0.931   2.199   0.319   0.969

Parameters    ?   ?   2   ?   ?
CM           -0.114   0.493   1.310   2.235
CF           -0.?40   1.347   0.522   1.335
MLE          -?.046   1.297   0.547   1.394

the case where $\mu = 0$, $\lambda = 4$, $\sigma_1^2 = \sigma_2^2 =$ [illegible],
and even when the variance parameters are assumed equal, effective
estimation is not possible by either method when $\mu = 0$, $\lambda = 2$,
$\sigma^2 = 1$.

The inability to estimate accurately when $\mu = 0$ and $\lambda > 1$, noted in
both Tables 1 and 2, has been further substantiated empirically and may
be explained by noting that the distribution function $F(x;\theta)$ of
equation (2.1) is nearly singular in this region when regarded as a
function of its parameters. By this is meant that the distribution is
strongly dependent only on certain combinations of the parameters, and so
is only very slightly perturbed as the parameter vector is allowed to vary
on the hypersurfaces generated by fixing these combinations equal to some
constants. In the case where $\mu = 0$, the underlying distribution is
symmetric, and its first three moments are matched by any symmetric member
of the modified compound Poisson family whose parameters satisfy

$$\sigma_1^2 + \lambda\sigma_2^2 = \operatorname{Var}(X), \tag{5.1}$$

where $\operatorname{Var}(X)$ is the population variance. Distributions whose
parameters lie near the hypersurface (5.1) may therefore be virtually
indistinguishable even though their parameters differ widely. For example,
in the case of Table 1 where the true parameter values are $\mu = 0$,
$\lambda = 2$, $\sigma_1^2 =$ [illegible] and $\sigma_2^2 = 1$, each of the three
estimation methods fits the data with a curve which is nearly symmetric and
which has almost the same mean and variance as does the underlying
population, as can be verified by the use of equations (2.1*). These fitted
distributions do not differ greatly from one another or from the parent
distribution, and yet the corresponding


TABLE 2

Comparison of Characteristic Function (CF) and Maximum Likelihood (MLE)
Estimates of the Parameters of the Modified Compound Poisson Distribution:
$\sigma_1^2$ and $\sigma_2^2$ Assumed Equal

              $\mu$   $\lambda$   $\sigma^2$  |   $\mu$   $\lambda$   $\sigma^2$

Parameters    4       1       1      |   ?       1       1
CF            2.930   0.335   ?      |   2.1?5   1.013   1.125
MLE           ?.319   3.?73   1.201  |   2.119   1.035   1.155

Parameters    1       1       1      |   0       2       ?
CF            ?       0.354   0.931  |   0.25?   0.333   2.340
MLE           1.030   0.333   0.931  |   0.021   0.415   0.293

Parameters    4       3       1      |   2       3       ?
CF            3.9??   3.055   0.933  |   ?       ?       ?
MLE           3.342   3.123   0.222  |   1.36?   3.139   1.132

Parameters    1       3       ?      |   ?       4       1
CF            ?       2.313   1.155  |   3.2??   ?       ?
MLE           1.034   2.324   1.355  |   3.???   ?.92?   3.523

Parameters    ?       4       1      |   1       4       ?
CF            2.423   3.301   0.513  |   ?       ?       ?
MLE           2.425   3.311   0.509  |   ?       ?       ?
estimates for the parameters $\lambda$, $\sigma_1^2$ and $\sigma_2^2$ are quite inaccurate.
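The moment-matching argument behind the hypersurface (5.1) can be checked
numerically. The following is a minimal sketch, assuming the standard
cumulants of a Poisson sum of normal variates plus independent normal noise;
the function name `mcp_cumulants` and the two parameter sets are illustrative
choices, not values taken from the tables.

```python
def mcp_cumulants(mu, lam, s1, s2):
    """First four cumulants of X = Z0 + sum_{k=1}^{N} Z_k.

    Assumes Z0 ~ N(0, s1), Z_k ~ N(mu, s2) and N ~ Poisson(lam), all
    independent; s1 and s2 are variances, not standard deviations.
    """
    k1 = lam * mu
    k2 = s1 + lam * (mu ** 2 + s2)
    k3 = lam * (mu ** 3 + 3.0 * mu * s2)
    k4 = lam * (mu ** 4 + 6.0 * mu ** 2 * s2 + 3.0 * s2 ** 2)
    return k1, k2, k3, k4


# Two hypothetical symmetric members (mu = 0) with the same value of
# s1 + lam * s2, i.e. lying on the same variance hypersurface.
first = mcp_cumulants(0.0, 2.0, 0.5, 1.0)
second = mcp_cumulants(0.0, 0.5, 2.0, 1.0)
```

Under these assumptions the two parameter sets share their mean, variance and
third cumulant and differ only from the fourth cumulant onward, which is why
samples of moderate size carry little information for separating them.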

The singularity of the modified compound Poisson distribution with
$\mu = 0$ is especially critical when $\lambda$ is either large or else extremely
small, for reasons which will be discussed shortly. If one is interested
only in fitting a curve to data, then these considerations may not be of
great concern. The precise estimation of the values of the individual
parameters $\lambda$, $\sigma_1^2$ and $\sigma_2^2$ when $\mu$ is very close to zero
would, however, seem to require impractically large samples unless $\lambda$
is of moderate magnitude.

Other interrelations among the parameters also exist and may cause
difficulty in estimation. In some portions of the parameter space, for
example, $F(x;\theta)$ is quite insensitive to perturbations of the parameters
on the surface

$$\lambda\mu = E(X),$$

where $E(X)$ is the population mean, and this in turn causes the estimators
of $\lambda$ and $\mu$ to be generally strongly negatively correlated, as is
evident in both Tables 1 and 2. Other such relations are not so obvious. In
an effort to gain some insight into these interrelations, and the effect
they have on estimation, information matrices have been computed and are
given in Bryant and Paulson (1981).

The information matrices reveal a very complicated pattern of
interrelations among the parameters $\mu$, $\lambda$, $\sigma_1^2$ and $\sigma_2^2$.
No attempt will be made here to discuss these in complete detail, but
rather only two observations, which will be of use in Section 6, will be
noted. It may be seen that as $\lambda$ increases, $F(x;\theta)$ becomes nearly
singular, which may be accounted for by the limiting normality of the
standardized variate

$$\frac{X - \lambda\mu}{\sqrt{\sigma_1^2 + \lambda(\mu^2 + \sigma_2^2)}}$$

as $\lambda \to \infty$ while $\mu$, $\sigma_1^2$ and $\sigma_2^2$ are kept fixed. The
practical implication of this is that if $\lambda$ is much larger than the
values investigated in Table 1, only parametric combinations which are
functions of $\lambda\mu$ and $\sigma_1^2 + \lambda(\mu^2 + \sigma_2^2)$ may be accurately
estimated without recourse to tremendously large samples. Conversely, if
$\lambda$ is very small the distribution is insensitive to the parameters $\mu$
and $\sigma_2^2$, since then the "noise" variate $Z_0$ of equation (1.1) is
dominant. Finally, these matrices indicate that the magnitude of the
variance $\sigma_1^2$ of $Z_0$ is a major factor in the overall estimability of
the parameters. As seems reasonable, a small value of $\sigma_1^2$ relative to
$\sigma_2^2$ permits more precise estimation than would be possible if this
were not the case.
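The rate at which the standardized variate approaches normality can be made
explicit through its skewness and excess kurtosis. The following is a short
sketch, assuming the usual cumulants of a Poisson sum of normal variates; the
symbols $\kappa_r$, $\gamma_1$ and $\gamma_2$ are introduced here for illustration only.

$$\kappa_2 = \sigma_1^2 + \lambda(\mu^2 + \sigma_2^2), \qquad
\kappa_3 = \lambda(\mu^3 + 3\mu\sigma_2^2), \qquad
\kappa_4 = \lambda(\mu^4 + 6\mu^2\sigma_2^2 + 3\sigma_2^4),$$

$$\gamma_1 = \frac{\kappa_3}{\kappa_2^{3/2}} = O(\lambda^{-1/2}), \qquad
\gamma_2 = \frac{\kappa_4}{\kappa_2^{2}} = O(\lambda^{-1}) \qquad (\lambda \to \infty).$$

Both quantities vanish as $\lambda$ grows, so for large $\lambda$ the data determine
little beyond the mean $\lambda\mu$ and the variance $\kappa_2$.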


6.  Designed  Estimation  Experiments 

Suppose it is of interest to estimate the parameters
$\theta = (\mu, \lambda, \sigma_1^2, \sigma_2^2)'$ of (1.2). If $P(t)$ can be continuously
and precisely observed, the arrival times of the Poisson process may be
recorded along with the corresponding jumps of $P(t)$, so that $N(t)$, $Y(t)$
and the individual jumps are observable. In this case, estimation of the
vector $\theta$ is straightforward. However, it is not difficult to think of
applications where continuous observation is either not physically
possible or else is not economically feasible, and yet the experimenter
does possess some control over the times at which the process of
equation (1.2) may be observed.
Suppose it has been decided that a fixed number $n$ of observations
of the process $P(t)$ are to be used for purposes of estimation, and for
simplicity it is agreed that a constant inter-observation time $\Delta t$ will
be employed. Then it is in the experimenter's interest to determine a value
of $\Delta t$ which will lead to parameter estimates of low variability. It
should not be surprising to find that different choices will lead to
estimates of varying quality, but the degree to which this is true turns
out to be remarkable. If reasonable initial estimates of the true
parameters are available, they may be used (in a manner to be described) to
great advantage in the problem of selecting an appropriate
inter-observation time.

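Given some fixed $\Delta t$, the values $P(\Delta t), P(2\Delta t), \ldots, P(n\Delta t)$
yield by differencing the random sample

$$X_j = P(j\Delta t) - P((j-1)\Delta t), \qquad j = 1, 2, \ldots, n,$$

where $P(0) = P_0$. These $X_j$ have as their common distribution
$F(x;\theta_{\Delta t})$, the modified compound Poisson distribution with the
parameter vector

$$\theta_{\Delta t} = (\mu, \lambda_{\Delta t}, \sigma_{1\Delta t}^2, \sigma_2^2)',$$

where $\lambda_{\Delta t} = \lambda\Delta t$ and $\sigma_{1\Delta t}^2 = \sigma_1^2\Delta t$. The problem of
estimating the parameters $\theta$ of the stochastic process is then reduced
to that of estimating $\theta_{\Delta t}$; if any of the procedures discussed in
Sections 2-4 is used to obtain the estimates

$$\hat\theta_{\Delta t} = (\hat\mu, \hat\lambda_{\Delta t}, \hat\sigma_{1\Delta t}^2, \hat\sigma_2^2)',$$

then the vector of process parameters is likewise estimated by

$$\hat\theta = (\hat\mu, \hat\lambda, \hat\sigma_1^2, \hat\sigma_2^2)',$$

where $\hat\lambda = \hat\lambda_{\Delta t}/\Delta t$ and $\hat\sigma_1^2 = \hat\sigma_{1\Delta t}^2/\Delta t$.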
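The rescaling between process parameters and per-interval parameters is
simple enough to state in code. The following is a minimal sketch; the
function names are hypothetical, and the tuple order
(mu, lambda, sigma_1^2, sigma_2^2) is assumed to match the parameter vector
used in the text.

```python
def to_interval_params(theta, dt):
    """Map process parameters to those of one increment of length dt."""
    mu, lam, s1, s2 = theta
    # Only the arrival rate and the noise variance scale with dt;
    # the jump mean and jump variance are unchanged.
    return (mu, lam * dt, s1 * dt, s2)


def to_process_params(theta_dt, dt):
    """Invert to_interval_params: recover per-unit-time parameters."""
    mu, lam_dt, s1_dt, s2 = theta_dt
    return (mu, lam_dt / dt, s1_dt / dt, s2)


# Example with the illustrative process used later in this section:
# mu = 1, lambda = 5 per unit time, sigma_1^2 = 0.5 per unit time, sigma_2^2 = 1.
interval_params = to_interval_params((1.0, 5.0, 0.5, 1.0), 0.1)
```

Estimates obtained from the differenced sample would be passed through
`to_process_params` with the same `dt` to return to the time scale of the
process.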

It was seen in Section 5 that the character of $F(x;\theta_{\Delta t})$ with
regard to the estimability of its parameters is strongly influenced by
the magnitudes of $\lambda_{\Delta t}$ and $\sigma_{1\Delta t}^2$. These quantities may be
controlled by the experimenter with proper selection of the
inter-observation time $\Delta t$, which in turn will permit accurate
estimation. In fact, if this is not done, the near singularity of the
information matrices over much of the parameter space makes it clear that
accurate estimation may not be possible.

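Let $\tilde\theta$ be a vector of prior estimates of the process parameters,
and let $\tilde\theta_{\Delta t}$ be the corresponding initial estimates of
$\theta_{\Delta t}$. If it is assumed momentarily that the estimates
$\hat\theta_{\Delta t}$ have a covariance matrix not too dissimilar to
$\frac{1}{n} I(\tilde\theta_{\Delta t})^{-1}$, where the $jk$ element of $I(\theta_{\Delta t})$
is given by

$$i_{jk}(\theta_{\Delta t}) = E\left[
\frac{\partial \log f(x;\theta_{\Delta t})}{\partial \theta_{\Delta t,j}}\,
\frac{\partial \log f(x;\theta_{\Delta t})}{\partial \theta_{\Delta t,k}}
\right], \tag{6.1}$$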


then these information matrices may be used in conjunction with the
initial estimates $\tilde\theta_{\Delta t}$ to approximate, for any given $\Delta t$,
the quantities $n\operatorname{Var}(\hat\mu)$,
$n\operatorname{Var}(\hat\lambda) = n\operatorname{Var}(\hat\lambda_{\Delta t})/\Delta t^2$,
$n\operatorname{Var}(\hat\sigma_1^2) = n\operatorname{Var}(\hat\sigma_{1\Delta t}^2)/\Delta t^2$ and
$n\operatorname{Var}(\hat\sigma_2^2)$. An approximation of the generalized variance of
$\hat\theta$ may also be useful, and is provided by the quantity
$\{|I(\tilde\theta_{\Delta t})|\,\Delta t^4\}^{-1}$. By calculating these numbers for
several values of $\Delta t$, it is possible to select an inter-observation
time which permits near-optimal performance in terms of the overall
variability of the estimates of the process parameters.

It should be mentioned here that $\frac{1}{n} I(\tilde\theta_{\Delta t})^{-1}$ may not
be a very good approximation for $\operatorname{Var}(\hat\theta_{\Delta t})$ even if maximum
likelihood is used to generate these estimators and the number of
observations is large, since the information matrices are typically
ill-conditioned so that a very slow approach to the asymptotic
distribution must be expected. This, however, does not invalidate the
procedure since the information matrices may still be regarded as
indicators of estimability, at least as long as the prior vector estimate
$\tilde\theta$ is not too inaccurate.
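The selection procedure can be sketched in code. The following is a rough
illustration only: it assumes the density is the Poisson mixture of normals
implied by equation (1.1), and it approximates the expectation in (6.1) by
central finite differences and a simple Riemann sum. These numerical choices,
the truncation point `kmax`, the grid settings and all function names are
assumptions made here, not the method used to compute the published
information matrices.

```python
import math


def mcp_pdf(x, theta, kmax=40):
    """Density of one increment: a Poisson mixture of normal densities."""
    mu, lam, s1, s2 = theta
    weight = math.exp(-lam)              # P(N = 0)
    total = 0.0
    for k in range(kmax + 1):
        var = s1 + k * s2                # variance given N = k
        z = (x - k * mu) ** 2 / (2.0 * var)
        total += weight * math.exp(-z) / math.sqrt(2.0 * math.pi * var)
        weight *= lam / (k + 1)          # recurrence to P(N = k + 1)
    return total


def fisher_information(theta, npts=1201, width=10.0, rel_step=1e-4):
    """Approximate I(theta) using the integral of f_j * f_k / f over x."""
    mu, lam, s1, s2 = theta
    mean = lam * mu
    sd = math.sqrt(s1 + lam * (mu ** 2 + s2))
    lo = mean - width * sd
    dx = 2.0 * width * sd / (npts - 1)
    steps = [rel_step * max(abs(t), 1e-3) for t in theta]
    info = [[0.0] * 4 for _ in range(4)]
    for i in range(npts):
        x = lo + i * dx
        f = mcp_pdf(x, theta)
        if f <= 0.0:
            continue                     # density underflowed in the tail
        grad = []
        for j in range(4):
            up, dn = list(theta), list(theta)
            up[j] += steps[j]
            dn[j] -= steps[j]
            grad.append((mcp_pdf(x, up) - mcp_pdf(x, dn)) / (2.0 * steps[j]))
        for j in range(4):
            for k in range(4):
                info[j][k] += grad[j] * grad[k] / f * dx
    return info


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting for a small matrix."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def scaled_variances(theta, dt):
    """Approximate n*Var of the process-parameter estimates for a given dt."""
    mu, lam, s1, s2 = theta
    cov = invert(fisher_information((mu, lam * dt, s1 * dt, s2)))
    return {"mu": cov[0][0], "lambda": cov[1][1] / dt ** 2,
            "sigma1_sq": cov[2][2] / dt ** 2, "sigma2_sq": cov[3][3]}
```

Calling `scaled_variances(prior, dt)` for a handful of candidate `dt` values
and comparing the results mirrors the tabulation described above. When the
information matrix is close to singular, the inverse produced by this sketch
should be read only as a qualitative warning sign, in line with the caveat in
the preceding paragraph.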

To illustrate the suggested method, suppose that a process $P(t)$ has
true parameter values $\mu = 1$, $\lambda = 5$/unit time, $\sigma_1^2 = 1/2$/unit time
and $\sigma_2^2 = 1$ which are to be estimated based on 500 observations. If the
computations indicated above are performed for several values of $\Delta t$
(using the true parameters as initial estimates) the data of Table 3 will
result. There is a clear indication that an inter-observation time of
roughly 0.1 of one time unit will produce nearly optimal estimates of the
process parameters.

On the other hand, if $\Delta t$ were to be arbitrarily chosen to be 1.0
(resulting in an experiment which would run ten times as long, with
sampling only one-tenth as often) effective estimation would be impossible
without dramatically increasing the number of observations.

Four simulated samples of size 500, whose parameters correspond to
inter-observation times of 1.0, 0.2, 0.1 and 0.05 time units, were
generated and their parameters estimated by means of the characteristic
function algorithm. The same stream of random numbers was used in the
creation of each of the samples to facilitate the comparison of data
insofar as possible. Results are displayed in Table 4, and these show that
reasonable estimates are achieved for $\Delta t$ in the range from 0.05 to 0.2,
while as predicted the solution obtained for an inter-observation time of
1.0 is quite inaccurate.
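A sample of increments for a chosen inter-observation time can be generated
along the following lines. This is a minimal sketch using only the Python
standard library; the Poisson generator (Knuth's multiplication method), the
function names and the seeding scheme are assumptions made here. Reusing one
seed makes runs reproducible, but it is not claimed to reproduce the common
random number stream used in the study.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw from Poisson(lam) by Knuth's method; adequate for small lam."""
    limit = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_increments(theta, dt, n, seed=0):
    """Simulate n increments X_j of the process observed every dt time units."""
    mu, lam, s1, s2 = theta
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        jumps = poisson_draw(rng, lam * dt)
        # Given the number of jumps, the increment is normal with mean
        # jumps * mu and variance s1 * dt + jumps * s2.
        sd = math.sqrt(s1 * dt + jumps * s2)
        sample.append(rng.gauss(jumps * mu, sd))
    return sample


# Illustrative use: 500 increments at dt = 0.1 for the example process.
sample = simulate_increments((1.0, 5.0, 0.5, 1.0), 0.1, 500, seed=1)
```

The resulting sample would then be passed to whichever estimation procedure
is under study, and the estimates rescaled by the inter-observation time.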
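TABLE 3

Effect of Inter-Observation Time on Parameter Estimation
$\mu = 1$, $\lambda = 5$, $\sigma_1^2 = 0.5$, $\sigma_2^2 = 1$

$\Delta t$   $\mu$   $\lambda_{\Delta t}$   $\sigma_{1\Delta t}^2$   $\sigma_2^2$   $n\mathrm{Var}(\hat\mu)$   $n\mathrm{Var}(\hat\lambda_{\Delta t})$   $n\mathrm{Var}(\hat\lambda)$   $n\mathrm{Var}(\hat\sigma_{1\Delta t}^2)$   $n\mathrm{Var}(\hat\sigma_1^2)$   $n\mathrm{Var}(\hat\sigma_2^2)$

1.0    1   5      0.5     1   112.92   2824.7    2824.7   1446.3      1446.3    59.551
0.2    1   1      0.1     1   6.0847   6.1973    154.33   0.20932     5.2330    ?.?650
0.1    1   0.5    0.05    1   5.3150   1.3424    134.24   0.019243    1.9249    9.1207
0.05   1   0.25   0.025   1   5.9360   0.43382   173.53   0.0024967   0.99869   12.359

$\Delta t$   $\mu$   $\lambda_{\Delta t}$   $\sigma_{1\Delta t}^2$   $\sigma_2^2$   $|I(\theta_{\Delta t})|^{-1}$   $|I(\theta_{\Delta t})|^{-1}/(\Delta t)^4$

1.0    1   5      0.5     1   $0.279 \times 10^{7}$    $0.279 \times 10^{7}$
0.2    1   1      0.1     1   $0.115 \times 10^{2}$    $0.716 \times 10^{4}$
0.1    1   0.5    0.05    1   $0.573 \times 10^{0}$    $0.573 \times 10^{4}$
0.05   1   0.25   0.025   1   $0.724 \times 10^{-1}$   $0.116 \times 10^{5}$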

TABLE 4

Effect of Inter-Observation Time on Parameter Estimation - Simulation Results
$\mu = 1$, $\lambda = 5$, $\sigma_1^2 = 0.5$, $\sigma_2^2 = 1$, $n = 500$

$\Delta t$   $\hat\mu$   $\hat\lambda_{\Delta t}$   $\hat\sigma_{1\Delta t}^2$   $\hat\sigma_2^2$  |  $\hat\mu$   $\hat\lambda$   $\hat\sigma_1^2$   $\hat\sigma_2^2$

1.0    1.646   3.042    1.791    0.000  |  1.546   3.042   1.791   0.000
0.2    0.900   1.0796   0.0995   1.065  |  0.900   5.393   0.440   1.065
0.1    0.938   0.5292   0.0465   1.047  |  0.939   5.232   0.465   1.047
0.05   1.014   0.2241   0.0252   0.933  |  1.014   4.432   0.504   0.933


While it is true that $\Delta t$ was selected with prior knowledge of
the true parameters in this example, it is also evident that it need not
be chosen with extreme precision in order to gain acceptable results.
Thus it is believed that this method can be used with success in
conjunction with reasonable initial estimates. A simple, useful rule of
thumb is the following: if the arrival rate is $\lambda$ per unit time, then
the inter-observation time should be about $(2\lambda)^{-1}$ time units, that
is, a sampling rate of $2\lambda$ per unit time. The use of such a selection of
inter-observation time can lead to much enhanced parameter estimates.
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